<html>
<head><meta charset="utf-8"><title>Allocation wrapping around the address space · t-lang/wg-unsafe-code-guidelines · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/index.html">t-lang/wg-unsafe-code-guidelines</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html">Allocation wrapping around the address space</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="245321001"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245321001" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Elichai Turkel <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245321001">(Jul 08 2021 at 14:29)</a>:</h4>
<p>The safety docs for pointers say the following sentence a lot:</p>
<blockquote>
<p>The offset being in bounds cannot rely on “wrapping around” the address space</p>
</blockquote>
<p>How do I make sure of that though? Say I got some buffer (pointer+length) from C code, can I assume it's contingous and doesn't wrap around the address space? (assuming I don't know which allocator does the C code use, doesn't know the target architecture and can't read its code).</p>
<p>Should I add some assert to check this? if so, how? (I have a feeling <code>assert!((ptr as usize).checked_add(size).unwrap() &lt;= isize::MAX as usize)</code> is somewhat UBish)</p>



<a name="245321364"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245321364" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Elichai Turkel <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245321364">(Jul 08 2021 at 14:31)</a>:</h4>
<p>I'm not sure if <code>wrapping_add</code> solves it because it says it basically has the same rules but just deferred to the time of dereferencing?</p>



<a name="245322638"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245322638" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Alexander Huszagh (He/Him) <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245322638">(Jul 08 2021 at 14:40)</a>:</h4>
<p><span class="user-mention silent" data-user-id="249222">Elichai Turkel</span> <a href="#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space/near/245321001">said</a>:</p>
<blockquote>
<p>The safety docs for pointers say the following sentence a lot:</p>
<blockquote>
<p>The offset being in bounds cannot rely on “wrapping around” the address space</p>
</blockquote>
<p>How do I make sure of that though? Say I got some buffer (pointer+length) from C code, can I assume it's contingous and doesn't wrap around the address space? (assuming I don't know which allocator does the C code use, doesn't know the target architecture and can't read its code).</p>
<p>Should I add some assert to check this? if so, how? (I have a feeling <code>assert!((ptr as usize).checked_add(size).unwrap() &lt;= isize::MAX as usize)</code> is somewhat UBish)</p>
</blockquote>
<p>If you have a buffer that's contiguous in memory, then you can safely add up to the <a href="https://doc.rust-lang.org/std/primitive.pointer.html#method.add">length</a> without UB.</p>
<blockquote>
<p>If any of the following conditions are violated, the result is Undefined Behavior:<br>
 -  Both the starting and resulting pointer must be either in bounds or one byte past the end of the same allocated object.<br>
 -  The computed offset, in bytes, cannot overflow an isize.<br>
 -  The offset being in bounds cannot rely on “wrapping around” the address space. That is, the infinite-precision sum must fit in a usize.</p>
</blockquote>
<p>If it's contiguous in memory, then none of these can be violated, I believe. But you really have to know it's contiguous, which any standard allocator should do. <code>allocator_traits::allocate</code> must assign contiguous memory for newer C++ <a href="https://en.cppreference.com/w/cpp/memory/allocator_traits/allocate">standards</a>:</p>
<blockquote>
<p>Alloc::allocate was not required to create array object until P0593R6, which made using non-default allocator for std::vector and some other containers not actually well-defined according to the core language specification.</p>
<p>After calling allocate and before construction of elements, pointer arithmethic of Alloc::value_type* is well-defined within the allocated array, but the behavior is undefined if elements are accessed.</p>
</blockquote>



<a name="245322813"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245322813" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Alexander Huszagh (He/Him) <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245322813">(Jul 08 2021 at 14:41)</a>:</h4>
<p>I believe the same is true for malloc, let me check now.</p>



<a name="245323257"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245323257" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Amanieu <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245323257">(Jul 08 2021 at 14:44)</a>:</h4>
<p>C also has the same requirement of allocations not wrapping around the address space. So taking a pointer+length from C code should be fine.</p>



<a name="245323360"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245323360" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Alexander Huszagh (He/Him) <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245323360">(Jul 08 2021 at 14:45)</a>:</h4>
<p>As of C11, <code>calloc</code> says in 7.22.3.2:</p>
<blockquote>
<p>The calloc function allocates space for an array of nmem bobjects, each of whose size issize. The space is initialized to all bits zero.</p>
</blockquote>
<p><code>malloc</code> says in 7.22.3.4:</p>
<blockquote>
<p>The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.</p>
</blockquote>
<p>So yep, can guarantee contiguous memory as of C11.</p>



<a name="245323577"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245323577" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Alexander Huszagh (He/Him) <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245323577">(Jul 08 2021 at 14:46)</a>:</h4>
<p>In short, as long as someone isn't doing something very weird, as long as the code or custom malloc is standards-conformant, if you can do in your code <code>while ptr &lt; array + size</code>, then you're guaranteed no UB.</p>



<a name="245335565"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245335565" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Elichai Turkel <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245335565">(Jul 08 2021 at 16:11)</a>:</h4>
<p>but if C gives me an mmapped file, then I think it could wrap the address space hypothetically?</p>



<a name="245336308"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245336308" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> bjorn3 <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245336308">(Jul 08 2021 at 16:17)</a>:</h4>
<p>The kernel won't ever wrap around when mmapping AFAIK. The high half of the address space is reserved by the kernel itself on most OSes.</p>



<a name="245341528"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245341528" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Josh Triplett <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245341528">(Jul 08 2021 at 17:02)</a>:</h4>
<p><span class="user-mention silent" data-user-id="133247">bjorn3</span> <a href="#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space/near/245336308">said</a>:</p>
<blockquote>
<p>The kernel won't ever wrap around when mmapping AFAIK. The high half of the address space is reserved by the kernel itself on most OSes.</p>
</blockquote>
<p>That's true for some architectures. On some, the full address space belongs to userspace, from <code>0</code> to <code>usize::MAX</code>. It'd still be unusual to wrap, not least of which because the OS typically reserves the page at 0 to catch null pointers, but it's not portable to assume that the high half of the address space is reserved.</p>



<a name="245341634"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245341634" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Josh Triplett <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245341634">(Jul 08 2021 at 17:03)</a>:</h4>
<p>On x86-32, for instance, you can have a 3G/1G split (high quarter, not high half), or even a 4G/4G split. <a href="https://lwn.net/Articles/39283/">https://lwn.net/Articles/39283/</a></p>



<a name="245342799"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation%20wrapping%20around%20the%20address%20space/near/245342799" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> bjorn3 <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Allocation.20wrapping.20around.20the.20address.20space.html#245342799">(Jul 08 2021 at 17:12)</a>:</h4>
<p>It is LLVM UB even in that case I think, so it shouldn't happen even for C code. At least when compiled with clang.</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>